25 November 2015
| ST | aspA | glnA | gltA | glyA | pgm | tkt | uncA |
|---|---|---|---|---|---|---|---|
| 474 | 2 | 4 | 1 | 2 | 2 | 1 | 5 |
| 61 | 1 | 4 | 2 | 2 | 6 | 3 | 17 |
| 2381 | 175 | 251 | 216 | 282 | 359 | 293 | 102 |
| 48 | 2 | 4 | 1 | 2 | 7 | 1 | 5 |
| 2370 | 1 | 4 | 2 | 2 | 6 | 5 | 17 |
Does the proportion of human cases attributed to each source change seasonally?
Did the intervention in the poultry industry work?
Is attribution related to rurality?
\[ P(\mathsf{st}) = \sum_j \underbrace{P(\mathsf{st} \mid \mathsf{source}_j)}_\text{genomic model} \underbrace{P(\mathsf{source}_j)}_\text{attribution to source} \]
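As a rough numerical illustration of this mixture, the sketch below combines per-source genotype probabilities with attribution proportions, then applies Bayes' rule to get the probability that a case with this sequence type came from each source. The source names and all numbers are made up for illustration and are not estimates from the talk.

```python
# Hypothetical values for illustration only; not estimates from the talk.
sources = ["poultry", "ruminants", "water", "other"]

# P(st | source_j): probability of one particular sequence type on each source
# (this is what the genomic model would supply).
p_st_given_source = [0.08, 0.02, 0.001, 0.005]

# P(source_j): proportion of human cases attributed to each source (sums to 1).
p_source = [0.6, 0.3, 0.05, 0.05]

# Mixture: P(st) = sum_j P(st | source_j) * P(source_j)
p_st = sum(g * a for g, a in zip(p_st_given_source, p_source))

# Bayes' rule: P(source_j | st) = P(st | source_j) * P(source_j) / P(st)
p_source_given_st = [g * a / p_st for g, a in zip(p_st_given_source, p_source)]

for name, p in zip(sources, p_source_given_st):
    print(f"{name}: {p:.3f}")
```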
Assume that Campylobacter genotypes arise from two or more homogeneously mixing populations, in which we have:
Mutation, where novel alleles are produced.
Recombination, where the allele at a given locus has been observed before, but not in this allelic profile.
Migration between sources, where genotypes have been observed previously.
| ST | aspA | glnA | gltA | glyA | pgm | tkt | uncA |
|---|---|---|---|---|---|---|---|
| 474 | 2 | 4 | 1 | 2 | 2 | 1 | 5 |
| ? | 2 | 4 | 1 | 2 | 29 | 1 | 5 |
We have a novel allele at the pgm locus.
We assume this genotype has arisen through mutation.
| ST | aspA | glnA | gltA | glyA | pgm | tkt | uncA |
|---|---|---|---|---|---|---|---|
| 474 | 2 | 4 | 1 | 2 | 2 | 1 | 5 |
| ? | 2 | 4 | 1 | 2 | 1 | 1 | 5 |
| 45 | 4 | 7 | 10 | 4 | 1 | 7 | 1 |
| 3718 | 2 | 4 | 1 | 4 | 1 | 1 | 5 |
We have seen this pgm allele before, but haven't seen this genotype.
We assume it arose through recombination.
| ST | aspA | glnA | gltA | glyA | pgm | tkt | uncA |
|---|---|---|---|---|---|---|---|
| 474 | 2 | 4 | 1 | 2 | 2 | 1 | 5 |
| ? | 2 | 4 | 1 | 2 | 2 | 1 | 5 |
This is just 474. We've seen it before, but possibly not on this source.
We assume it arose through migration.
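The three cases above can be sketched as a simple decision rule on a seven-locus allelic profile. This is a simplified, whole-profile illustration rather than the model itself; the function name `classify` and the small set of known profiles (taken from the tables above) are assumptions for the example.

```python
N_LOCI = 7  # seven MLST loci per profile

def classify(profile, known_profiles):
    """Label how a profile would be assumed to arise, given profiles seen before."""
    known = [tuple(p) for p in known_profiles]
    profile = tuple(profile)
    # Set of alleles already observed at each locus.
    seen = [{p[l] for p in known} for l in range(N_LOCI)]
    if any(allele not in seen[l] for l, allele in enumerate(profile)):
        return "mutation"        # at least one allele has never been seen
    if profile not in known:
        return "recombination"   # known alleles in a new combination
    return "migration"           # the whole profile has been seen before

# Previously observed profiles: ST 474, ST 45 and ST 3718 from the tables above.
known = [
    (2, 4, 1, 2, 2, 1, 5),
    (4, 7, 10, 4, 1, 7, 1),
    (2, 4, 1, 4, 1, 1, 5),
]

print(classify((2, 4, 1, 2, 29, 1, 5), known))  # novel pgm allele
print(classify((2, 4, 1, 2, 1, 1, 5), known))   # known pgm allele, new profile
print(classify((2, 4, 1, 2, 2, 1, 5), known))   # identical to ST 474
```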
\[ \phi(y \mid k,X) = \sum_{c\in X} \frac{M_{S_ck}}{N_{S_c}} \prod_{l=1}^7 \left\{\begin{array}{ll} \mu_k & \text{if $y^{l}$ is novel,}\\ (1-\mu_k)R_k\sum_{j=1}^K M_{jk}f^l_{y^lj} & \text{if $y^{l}\neq c^l$}\\ (1-\mu_k)\left[1 - R_k(1 - \sum_{j=1}^K M_{jk}f^l_{y^lj})\right] & \text{if $y^{l}=c^l$} \end{array} \right. \]
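A minimal sketch of how this likelihood might be evaluated is given below, assuming \(S_c\) is the source of observed genotype \(c\), \(N_{S_c}\) is the number of observed genotypes from that source, and \(f^l_{aj}\) is the observed frequency of allele \(a\) at locus \(l\) in source \(j\). The function names, argument layout, and toy two-locus data are hypothetical and chosen only to keep the example small.

```python
def allele_freqs(X, src, K):
    """f[l][j] maps allele -> observed frequency at locus l in source j."""
    counts = [src.count(j) for j in range(K)]
    f = [[{} for _ in range(K)] for _ in range(len(X[0]))]
    for c, s in zip(X, src):
        for l, a in enumerate(c):
            f[l][s][a] = f[l][s].get(a, 0.0) + 1.0 / counts[s]
    return f

def phi(y, k, X, src, M, mu, R):
    """Likelihood of genotype y in source k given observed genotypes X.

    M[j][k] is the migration weight from source j into k, mu[k] the
    mutation probability and R[k] the recombination probability.
    """
    K = len(M)
    counts = [src.count(j) for j in range(K)]
    f = allele_freqs(X, src, K)
    total = 0.0
    for c, s in zip(X, src):
        prod = 1.0
        for l, (yl, cl) in enumerate(zip(y, c)):
            # Migration-weighted frequency of allele yl at locus l.
            freq = sum(M[j][k] * f[l][j].get(yl, 0.0) for j in range(K))
            if all(yl not in f[l][j] for j in range(K)):
                prod *= mu[k]                                  # novel allele
            elif yl != cl:
                prod *= (1 - mu[k]) * R[k] * freq              # differs from c
            else:
                prod *= (1 - mu[k]) * (1 - R[k] * (1 - freq))  # matches c
        total += M[s][k] / counts[s] * prod
    return total

# Toy data: two sources, two loci (the real scheme uses seven loci).
X = [(1, 1), (1, 2), (2, 2)]
src = [0, 0, 1]
M = [[0.9, 0.2], [0.1, 0.8]]  # each column sums to 1
mu = [0.01, 0.01]
R = [0.1, 0.1]
value = phi((1, 3), 0, X, src, M, mu, R)
print(value)
```

Summing over every observed genotype \(c\) treats each one as a possible ancestor of \(y\), weighted by how likely its source is to contribute to source \(k\).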
\[ P(\mathsf{st}) = \sum_j \underbrace{P(\mathsf{st} \mid \mathsf{source}_j)}_\text{genomic model} \underbrace{P(\mathsf{source}_j)}_\text{attribution to source} \]
\[ P(\mathsf{st} \mid \underbrace{t, \mathbf{x}}_\text{covariates}) = \sum_j \underbrace{P(\mathsf{st} \mid \mathsf{source}_j)}_\text{genomic model} \underbrace{P(\mathsf{source}_j \mid t, \mathbf{x})}_\text{attribution with covariates} \]
Nested within each source \(j\) we have \[ \begin{aligned} \mathsf{logit}\left(P(\mathsf{source}_j \mid t, \mathbf{x})\right) &= \mathsf{Location}_\mathbf{x} \cdot \mathbf{1}\left[t \geq 2008\right] \cdot \mathsf{Month}_t + \epsilon_{\mathbf{x}t}\\ \epsilon_{\mathbf{x}t} &\sim \mathsf{Normal}(\rho \epsilon_{\mathbf{x}(t-1)}, \sigma^2) \end{aligned} \]
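To make the autoregressive error term concrete, the sketch below simulates \(\epsilon_{\mathbf{x}t}\) for a single location and maps a linear predictor back to the probability scale. It is an illustration only: the starting draw from \(\mathsf{Normal}(0, \sigma^2)\), the fixed linear predictor, and the parameter values are assumptions, and it ignores any normalisation across sources.

```python
import math
import random

def simulate_ar1(n, rho, sigma, seed=0):
    """Draw eps_t ~ Normal(rho * eps_{t-1}, sigma^2) for t = 1..n-1.

    The first value is drawn from Normal(0, sigma^2), which is an
    assumption made for this sketch.
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        eps.append(rng.gauss(rho * eps[-1], sigma))
    return eps

def inv_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical settings: 24 months, strong correlation, modest noise.
eps = simulate_ar1(n=24, rho=0.8, sigma=0.3, seed=1)
linear_predictor = 0.5  # stand-in for the location, intervention and month terms
p_source = [inv_logit(linear_predictor + e) for e in eps]
print([round(p, 3) for p in p_source[:6]])
```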
Covariate effects are updated in a Gibbs step conditional on the correlation \(\rho\), the variance \(\sigma^2\) and \(P(\mathsf{source}_j \mid t, \mathbf{x})\).
\(\rho\) and \(\sigma^2\) are then updated using Gibbs steps conditional on the covariates and \(P(\mathsf{source}_j \mid t, \mathbf{x})\).
\(P(\mathsf{source}_j \mid t, \mathbf{x})\) are block updated from the full conditional, interleaved with Metropolis-Hastings steps.
Urban cases tend to be more associated with poultry, and rural cases with ruminants.
There does seem to be some evidence for seasonality in attribution.
The poultry intervention in 2007 resulted in a marked reduction in poultry-related cases in urban areas, and a weaker reduction in rural areas.
Very few cases are associated with water or other sources.
Limitation: the genomic model is assumed constant through time.
Slides: http://bit.ly/1M5PP4O
Twitter: @jmarshallnz
Github: jmarshallnz